-
Notifications
You must be signed in to change notification settings - Fork 545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ocp] Fix backtrace in ocp collector #3515
base: main
Are you sure you want to change the base?
Conversation
Congratulations! One of the builds has completed. 🍾 You can install the built RPMs by following these steps:
Please note that the RPMs should be used only in a testing environment. |
9b5969c
to
a11cba6
Compare
verified the fix in my lab environment, lgtm |
sos/collector/clusters/ocp.py
Outdated
'which oc', chroot=self.primary.host.sysroot | ||
'which oc', need_root=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are very much not the same thing, and this is not the correct fix.
chroot
means "run this command in a chroot, so that we aren't in '/
'". The conditional before this makes it obvious that we're trying to make sure that we run in the host's mounted filesystem, not inside the container.
need_root
simply makes sure we're executing as the root user, or otherwise we use `sudo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not the same thing, but run_command() doesn't provide a 'chroot' argument. It does provide a 'need_root' argument, and it's used in other places in ocp.py, like in line #349. Should we instead rewrite run_command() to accept a chroot argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what should be the right way of detecting oc
binary inside a container?
Should we search for the binary on the host (that sounds meaningful to me, as oc
is the tool to manage containers, so it should run on host and not necessarily in container, am I right?)?
Or should we search for the binary under root
(that does work in practise)?
When I am in a container, what is the correct way of detecting path to (the right?) oc
binary?
If we dont know 100% answer to the latest question, I lean towards the current fix - which fixes missing parameter of a method (that never works) by something that works (at least in the tested use cases).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The correct way is to look in the mounted root filesystem inside the container, which would be passed by the HOST
or similar env var and be known to sos. This is what the chroot
is intended for.
The current change does not actually fix that binary discovery. It simply runs which oc
. The needs_root
here is meaningless as we'd already be root in the container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so I'll remove that part and resend the PR with only the fix that solves the backtrace.
While runnig sos collector in an ocp cluster we got the following trace: Traceback (most recent call last): File "/usr/sbin/sos", line 22, in <module> sos.execute() File "/usr/lib/python3.6/site-packages/sos/__init__.py", line 193, in execute self._component.execute() File "/usr/lib/python3.6/site-packages/sos/collector/__init__.py", line 1175, in execute self.prep() File "/usr/lib/python3.6/site-packages/sos/collector/__init__.py", line 896, in prep self.cluster.setup() File "/usr/lib/python3.6/site-packages/sos/collector/clusters/ocp.py", line 132, in setup out = self.exec_primary_cmd(self.fmt_oc_cmd("auth can-i '*' '*'")) File "/usr/lib/python3.6/site-packages/sos/collector/clusters/ocp.py", line 104, in fmt_oc_cmd return "%s %s" % (self.oc_cmd, cmd) File "/usr/lib/python3.6/site-packages/sos/collector/clusters/ocp.py", line 81, in oc_cmd 'which oc', chroot=self.primary.host.sysroot TypeError: run_command() got an unexpected keyword argument 'chroot' This patch changes the unused option 'chroot' that was undefined in run_command() to the option need_root that was in use in other commands in ocp.py. Related: RH: RHEL-24351 Co-authored-by: Alberto Losada Grande <[email protected]> Signed-off-by: Jose Castillo <[email protected]>
a11cba6
to
fc8da6a
Compare
I removed |
Is propr |
The default of 'chroot' is False iIrc, so Alberto's test should tell us more for sure |
Kindly requesting @alosadagrande for a test.. |
I do not have the environment anymore. I'll try to ask for a new one, but it can take a couple of weeks |
@alosadagrande @jcastill any update on this one |
The only blocker in this one is to test the latest fix. I'll try to set up an environment in the lab and check it soon. |
While runnng sos collector in an ocp cluster
we got the following trace:
Traceback (most recent call last):
File "/usr/sbin/sos", line 22, in
sos.execute()
File "/usr/lib/python3.6/site-packages/sos/init.py",
line 193, in execute
self._component.execute()
File "/usr/lib/python3.6/site-packages/sos/collector/init.py",
line 1175, in execute
self.prep()
File "/usr/lib/python3.6/site-packages/sos/collector/init.py",
line 896, in prep
self.cluster.setup()
File "/usr/lib/python3.6/site-packages/sos/collector/clusters/ocp.py",
line 132, in setup
out = self.exec_primary_cmd(self.fmt_oc_cmd("auth can-i '' ''"))
File "/usr/lib/python3.6/site-packages/sos/collector/clusters/ocp.py",
line 104, in fmt_oc_cmd
return "%s %s" % (self.oc_cmd, cmd)
File "/usr/lib/python3.6/site-packages/sos/collector/clusters/ocp.py",
line 81, in oc_cmd
'which oc', chroot=self.primary.host.sysroot
TypeError: run_command() got an unexpected keyword argument 'chroot'
This patch changes the unused option 'chroot' that was undefined in run_command()
to the option need_root that was in use in other commands in ocp.py.
Related: RH: RHEL-24351
Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines